This primer should provide the reader with a deeper understanding of the concept of test validity and will 
present the recent available validity evidence on the relationship between SAT® scores and important 
college outcomes. In addition, the content examined on the SAT will be discussed as well as the 
fundamental attention paid to the fairness of SAT scores for all students. 

Introduction 


Test validity refers to the degree to which evidence exists to support the interpretation of test scores for 
particular purposes. It is important to note that we validate a test score for a particular use (e.g., 
admission, placement) and that validity is not the property of a test in and of itself. This means that as 
opposed to talking about a test as simply valid or not valid, you should instead state, for example, “There 
is a great deal of validity evidence to support the use of SAT scores for college admission decisions.” This 
also represents the notion that validity is a matter of degree and not absolute. It is therefore very 
important to gather validity evidence over time to either enhance or contradict previous findings. 


There are various sources of validity evidence that can be examined. With regard to the SAT, these 
sources of evidence may include the content tested (e.g., subject area and types of items), the internal 
structure of the test (e.g., reliability and other psychometric properties), and relationships between the 
test scores and other variables (e.g., correlations with the outcomes the test is expected to predict). In 
order to appropriately capture and respond to the inquiries and demands of test-takers, test users in 
higher education, the media, and the general public, the College Board has focused much of its validity 
research efforts on examining the relationship between the SAT and measures of college success. 1 This 
document will provide an overview of the validity evidence available on the current SAT (introduced in 
March 2005), focusing on the evidence supporting the use of SAT scores in college admission decisions. 
Validity Evidence Relating SAT® Scores to College Outcomes 


Over the last seven years, the College Board has collected higher education outcome data from four-year 
institutions to document evidence of the validity of the SAT for use in college admission. Research has 
examined the relationship between SAT scores and outcomes such as first-year grade point average 
(FYGPA), cumulative GPA through college, English course grades, mathematics course grades, retention 
at different points in time, and college completion in four and six years. The research that follows 
provides a substantial amount of validity evidence to support the use of SAT scores in college admission. 


Much of the validity evidence documenting the relationship between SAT scores and outcomes such as 
FYGPA, for example, is represented as correlation coefficients. A correlation coefficient is one way of 
describing the linear relationship between two measures. 2 Correlations range from -1 to +1, with a perfect 
positive correlation (+1.00) indicating that a top-scoring person on test 1 would also be the top-scoring 
person on test 2, and the second-best scorer on test 1 would also be the second-best scorer on test 2, and so 
on through the poorest performing person on both tests. A correlation of zero would indicate no 
relationship at all between test 1 and test 2. An often-cited rule of thumb for interpreting correlation 
coefficients 3 is that a small correlation has an absolute value of approximately .10; a medium correlation 
has an absolute value of approximately .30; and a large correlation has an absolute value of 
approximately .50 or higher. Validity coefficients in educational and psychological testing are rarely 
above .30. 4 Although this value may sound low to people without a detailed understanding of correlation 
coefficients, it may be helpful to consider the correlation coefficients representing other more familiar 
relationships in our lives. For example, the association between a major league baseball player’s batting 
average and his success in getting a hit in a particular instance at bat is .06, the correlation between 
antihistamines and reduced sneezing and runny nose is .11, and the correlation between prominent movie 
critics’ reviews and box office success is .17. 5 The uncorrected, observed, or raw i correlation coefficient 
representing the relationship between the SAT and FYGPA tends to be in the mid .30s. When corrected 
for restriction of range, 6 the correlation coefficient tends to be in the mid .50s, representing a strong 
relationship. This is about the same or higher than the predictive validity of graduate admission exams 
studied in a paper 7 published in Science, where corrected correlation coefficients across seven exams with 
graduate school FYGPA ranged from .41 for the Graduate Record Examination Total (GRE), Graduate 
Management Admission Test (GMAT), and Miller Analogies Test (MAT), to .59 for the Medical College 
Admission Test (MCAT). In that study, only the MCAT-FYGPA relationship would be considered stronger 
than the SAT-FYGPA relationship. The results of national SAT validity studies examining various 
outcomes of interest follow. ii 


i. Raw, as opposed to corrected for restriction of range. The correction accounts for the reduced variance in the predictor and criterion 
that results from analyzing only the higher SAT scores and FYGPAs available for admitted/enrolled students instead of all 
applicants. Note that it is a widely accepted practice to statistically correct correlation coefficients for restriction of range since 
only a sample (admitted/enrolled students) is available for analysis as opposed to the population (all applicants) for which the 
measure (SAT) was used to make decisions. 
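
To make the correlation coefficient and the restriction-of-range correction more concrete, the following Python sketch computes a Pearson correlation from paired scores and then applies one common univariate correction, often called Thorndike's Case 2. It is only an illustration: this primer does not state which correction formula the College Board studies used, and the function names, the toy score lists, and the standard-deviation ratio of 1.8 are assumptions chosen for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correct_for_range_restriction(r, sd_ratio):
    """Thorndike Case 2 correction (an assumed choice of formula).

    r:        correlation observed in the restricted (enrolled) group
    sd_ratio: predictor SD among all applicants divided by predictor SD
              among enrolled students (hypothetical value in this sketch)
    """
    return (r * sd_ratio) / math.sqrt(1 + r ** 2 * (sd_ratio ** 2 - 1))

# Toy data (hypothetical): identical rank order and spacing on both tests.
test_1 = [10, 20, 30, 40, 50]
test_2 = [1, 2, 3, 4, 5]
perfect = pearson_r(test_1, test_2)

# An observed correlation of .35 with an assumed SD ratio of 1.8.
corrected = correct_for_range_restriction(0.35, 1.8)
print(round(perfect, 2), round(corrected, 2))
```

With these assumed inputs, an observed correlation of .35 rises to about .56 after correction, which is the same pattern described above (mid .30s uncorrected, mid .50s corrected).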

First-Year Grade Point Average (FYGPA) 

The SAT and high school grade point average (HSGPA) are strong predictors of FYGPA, with the multiple 
correlation iii (SAT & HSGPA → FYGPA) typically in the mid .60s. 8 The results are consistent across 
multiple entering classes of first-year, first-time students (from 2006 to 2010), providing further validity 
evidence for the SAT in terms of the generalizability of the results. In addition, the SAT provides 
incremental validity above and beyond HSGPA in the prediction of FYGPA. Figure 1 displays the 
correlations of SAT, HSGPA, and the combination of SAT and HSGPA with FYGPA for the 2006 through 
2010 entering first-year cohorts. The results clearly show that SAT scores and HSGPA are each 
strong predictors of FYGPA, with correlations in the mid .50s. Moreover, the figure clearly shows the 
added benefit of using the combination of SAT scores and HSGPA because that combination yields the 
highest predictive validity (i.e., the green line is the highest). Using the two measures together to predict 
FYGPA is more powerful than using either HSGPA or SAT scores on their own because they each 
measure slightly different aspects of a student's achievement. 9 
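
As a rough illustration of why two predictors in the mid .50s can combine to a multiple correlation in the mid .60s, the sketch below uses the standard formula for the multiple correlation of one outcome with two predictors. The correlation between SAT scores and HSGPA is not reported in this primer, so the value of .45 used here is purely hypothetical, as are the variable names.

```python
import math

def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of an outcome with two predictors.

    r_y1, r_y2: correlation of each predictor with the outcome
    r_12:       correlation between the two predictors
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

r_sat = 0.55      # SAT with FYGPA (mid .50s, per the text)
r_hsgpa = 0.55    # HSGPA with FYGPA (mid .50s, per the text)
r_between = 0.45  # SAT with HSGPA: hypothetical, not reported in the primer

combined = multiple_r(r_sat, r_hsgpa, r_between)
increment = combined - r_hsgpa  # gain from adding SAT to HSGPA
print(round(combined, 3), round(increment, 3))
```

Under these assumptions the combined value is about .65. The less the two predictors overlap with each other, the more the second one adds to the first.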


ii. The samples analyzed in the College Board’s most recent SAT validity studies are most typically based on 110-160 four-year 
institutions that are diverse with regard to control (public versus private), size, selectivity, and region of the country. For 
additional information on the samples of institutions and students analyzed in each study, please refer to: Mattern, K.D., & 
Patterson, B.F. (2014). Synthesis of recent SAT validity findings: Trend data over time and cohorts (College Board Research 
Report in Review 2014-1). New York: The College Board. 

iii. Unless otherwise noted, the SAT and HSGPA correlations reported in this document were computed within institution, 
corrected for range restrictions, and aggregated, weighted by their respective sample size. 
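
The aggregation step described in note iii can be sketched as a sample-size-weighted average. This is a minimal illustration under stated assumptions: the institution values are hypothetical, and the primer does not say whether any transformation was applied to the coefficients before averaging, so the sketch simply averages them as they are.

```python
def weighted_mean_correlation(results):
    """Average within-institution correlations, weighting by sample size.

    results: list of (correlation, n_students) pairs, one per institution
    """
    total_n = sum(n for _, n in results)
    return sum(r * n for r, n in results) / total_n

# Hypothetical corrected correlations for three institutions.
institutions = [(0.50, 1000), (0.60, 3000), (0.40, 500)]
overall = weighted_mean_correlation(institutions)
print(round(overall, 3))
```

Because the largest institution carries the most weight, the aggregate here (about .56) sits closer to its value than the simple unweighted mean of .50 does.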
Figure 1. Correlations of HSGPA and SAT with FYGPA (2006-2010 cohorts). 10 
As was previously mentioned, the correlation coefficient is not always the most straightforward way to 
think of a relationship between two variables. Therefore, another way of considering the incremental 
validity of the SAT over and above HSGPA to predict FYGPA is presented in Figure 2, 11 which shows the 
relationship between the composite SAT score band (SAT critical reading + mathematics + writing) with 
mean FYGPA at different levels of HSGPA. For each level of HSGPA, higher SAT score bands are 
associated with higher mean FYGPAs. This demonstrates the added value of the SAT above HSGPA in 
predicting FYGPA. As an example, consider the students with an HSGPA in the “A” range. Those with an 
SAT composite score between 600 and 1190 had an average FYGPA of 2.5. However, those same A 
students with an SAT score between 2100 and 2400 had an average FYGPA of 3.6. When considering 
applicants with the same HSGPA, it is clear that the added information of a student’s SAT score(s) can 
provide much more detail on how that student would be expected to perform at an institution. 
Figure 2. Incremental validity of the SAT: Mean FYGPA by SAT score band controlling for 
HSGPA. 12 
Note: SAT score bands are based on the sum of SAT-CR, SAT-M, and SAT-W. HSGPA ranges were defined as: 

• “A” range: 4.33 (A+), 4.00 (A), and 3.67 (A-); 

• “B” range: 3.33 (B+), 3.00 (B), and 2.67 (B-); and 

• “C or Lower” range: 2.33 (C+) or lower. 
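
The grouping behind a chart like Figure 2 can be sketched as follows: assign each student to an HSGPA range and an SAT score band, then average FYGPA within each cell. The HSGPA cutoffs follow the note above. The SAT band edges are assumptions except for the 600-1190 and 2100-2400 bands quoted in the text, and the student records are hypothetical values chosen only so that the two "A" range cells mirror the example means of 2.5 and 3.6.

```python
from collections import defaultdict

def hsgpa_range(hsgpa):
    """Map an HSGPA on the 4.33 scale to the ranges defined in the note."""
    if hsgpa >= 3.67:
        return "A"
    if hsgpa >= 2.67:
        return "B"
    return "C or Lower"

def sat_band(composite, edges=(600, 1200, 1500, 1800, 2100)):
    """Return the lower edge of the SAT band containing a composite score.

    The default edges are hypothetical except for the 600-1190 and
    2100-2400 bands, which are the two bands quoted in the text.
    """
    lower = edges[0]
    for edge in edges:
        if composite >= edge:
            lower = edge
    return lower

def mean_fygpa_by_cell(students):
    """students: list of (hsgpa, sat_composite, fygpa) tuples."""
    cells = defaultdict(list)
    for hsgpa, sat, fygpa in students:
        cells[(hsgpa_range(hsgpa), sat_band(sat))].append(fygpa)
    return {cell: sum(values) / len(values) for cell, values in cells.items()}

# Hypothetical student records, for illustration only.
students = [(4.00, 1100, 2.4), (3.67, 1150, 2.6), (4.33, 2200, 3.7),
            (4.00, 2300, 3.5), (3.00, 1600, 2.9)]
means = mean_fygpa_by_cell(students)
print(means)
```

Comparing cells that share an HSGPA range but differ in SAT band is what "controlling for HSGPA" means in this kind of display.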


Another way to think about the added utility of the SAT over and above HSGPA to predict FYGPA is by 
examining the amount of error in the prediction of FYGPA by HSGPA alone, by SAT scores alone, or with 
HSGPA and SAT scores together, particularly for students with highly discrepant HSGPAs and SAT 
scores (much stronger HSGPA than SAT scores or vice versa, after the measures have been standardized). 
Previous research 13, 14 has found that about 16%-18% of students would be considered highly discrepant 
favoring their HSGPA, 16%-18% would be considered highly discrepant favoring their SAT scores, and 
about 65%-68% would be considered nondiscrepant. A recent study 15 of more than 150,000 first-year 
students attending 110 four-year institutions found that using students’ HSGPAs without their SAT 
scores to predict their FYGPA for admission would likely result in those students with much higher SAT 
scores than HSGPAs (discrepant favoring SAT) not being admitted, though they would have performed 
just as well in college as the admitted students with much higher HSGPAs than SAT scores. In other 
words, without SAT score information, there is a sizeable percentage of students who would be overlooked 
for admission to an institution when they could have been quite successful there. Essentially all 
differential prediction research conducted on the SAT and HSGPA with FYGPA has supported the fact 
that using the students’ HSGPAs in conjunction with their SAT scores results in the smallest amount of 
error in the prediction of FYGPA across all students. 16 
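
One way to picture how students are sorted into discrepant and nondiscrepant groups is sketched below. Both measures are standardized, the difference is taken, and students whose standardized difference exceeds a cutoff in either direction are flagged. The cutoff of one standard deviation is an assumption for illustration (the primer does not state the cutoff used in the cited research), and the five student records are hypothetical.

```python
import statistics

def z_scores(values):
    """Standardize values to mean 0 and (population) SD 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def classify_discrepancy(hsgpas, sats, threshold=1.0):
    """Label each student by the gap between standardized HSGPA and SAT.

    threshold: cutoff in SD units of the difference scores
               (an assumption; the primer gives no cutoff)
    """
    diffs = [h - s for h, s in zip(z_scores(hsgpas), z_scores(sats))]
    labels = []
    for d in z_scores(diffs):
        if d > threshold:
            labels.append("discrepant favoring HSGPA")
        elif d < -threshold:
            labels.append("discrepant favoring SAT")
        else:
            labels.append("nondiscrepant")
    return labels

# Hypothetical records: the first student has a high HSGPA and a low SAT
# score; the last student shows the reverse pattern.
hsgpas = [4.0, 3.0, 3.0, 3.0, 2.0]
sats = [1200, 1500, 1500, 1500, 1800]
labels = classify_discrepancy(hsgpas, sats)
print(labels)
```

In this toy sample the first student is flagged as discrepant favoring HSGPA, the last as discrepant favoring SAT, and the middle three as nondiscrepant.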

Cumulative GPA 

It is a commonly heard misunderstanding that the SAT does not predict anything more than FYGPA. 
Perhaps many people would be surprised to learn that the SAT remains similarly, if not slightly more, 
predictive of cumulative GPA through four years of college. Other large-scale studies and meta-analyses 
(aggregating multiple studies on the topic) provide strong support for the notion that the predictive 
validity of test scores such as the SAT is not limited to near-term outcomes such as FYGPA but extends to 
longer-term academic and career outcomes, as well. 17 Figure 3 displays the correlations of SAT, HSGPA, 
and the combination of SAT and HSGPA with cumulative GPA through the fourth year of college for the 
2006 entering college cohort. The figure clearly shows that both SAT scores and HSGPA are strong 
predictors of cumulative GPA, with correlations in the mid .50s through the four years of college. 18 In 
addition, the SAT continues to provide incremental value in the prediction of cumulative GPA over 
HSGPA, as evidenced by the fact that the green trend line in the graph is higher than the purple HSGPA 
trend line. The correlations in the figure actually appear to increase over time with a small dip for year 
four. iv 


iv. The sample changed slightly over years, which could explain the differences in results. Of the original 110 institutions that 
provided college performance data on the 2006 cohort, 66 provided second-year data, 60 provided third-year data, and 55 
provided fourth-year data. 
Figure 3. Correlations of HSGPA and SAT with cumulative GPA (2006 cohort, years 1-4). 19 




English Course Grades 

SAT scores are also related to performance in specific college courses. 20 This is particularly true in 
instances where the content of the college course is aligned with the content tested on the SAT (e.g., the 
SAT writing section with English course grades and the SAT mathematics section with mathematics 
course grades). Figure 4 depicts the positive linear relationship between SAT critical reading and writing 
scores and English course grades in the first year of college. You can see that those students with the 
highest SAT critical reading and writing scores (700-800 range) earned English course grades that were 
almost a whole letter grade higher than those of students with the lowest SAT scores (200-290). In 
addition, while only about half of the students in the lowest SAT score band in SAT critical reading or 
writing earned a B or higher in English, more than 90% of students in the highest SAT critical reading or 
writing score band earned a B or higher in English. 
Figure 4. The relationship between SAT critical reading and writing scores and first-year 
English grades. 21 
Math Course Grades 

Similar to English course grades, there is a positive relationship between SAT mathematics scores and 
mathematics course grades in the first year of college. 22 Figure 5 depicts the average mathematics course 
grade by SAT score band as well as the percentage of students earning a B or higher in their first-year 
mathematics courses by SAT score band. You can see that while students in the highest SAT 
mathematics score band (700-800) earned an average mathematics course grade of a B+ (3.31) in their 
first year, those students in the two lowest SAT score bands (200-390) earned an average mathematics 
course grade below a C (1.92). Also shown in Figure 5, 78% of those students in the highest SAT mathematics 
score band earned a B or higher in their first-year mathematics courses, while only 32% of the students in 
the lowest SAT mathematics score band earned a B or higher. 
Figure 5. The relationship between SAT math scores and first-year mathematics grades. 23 
Retention 

Because college retention is so highly related to college completion, 24 it is useful to have tools and 
measures that relate to and help us better understand the likelihood that a student will be retained at an 
institution. The following analyses and figures depict the strong relationship between the SAT and 
retention to the second year of college. Figure 6 shows the second-year retention rates by SAT score band 
for the 2006 through 2010 entering college cohorts. This figure clearly shows that students with higher 
SAT scores have higher second-year retention rates, and this is consistent across the five cohorts 
examined. 25 Students in the top SAT score band (2100-2400), for example, have second-year retention 
rates in the 90% range, whereas students in the bottom SAT score band (600-890) have second-year 
retention rates in the 60% range. Also note that the percentages of students returning to an institution by 
SAT score band are stable across cohorts. 
Figure 6. Retention to year 2 by SAT (2006-2010 cohorts). 26 
Similar to the SAT validity evidence pertaining to GPA, it is of great interest to understand the added 
value of SAT scores above and beyond HSGPA as they relate to retention rates within an institution. 
Figure 7 shows the mean retention rate by SAT score band, controlling for HSGPA, for the 2006 through 
2010 entering college cohorts. Within each cohort year, higher SAT scores are associated with higher 
retention rates. 27 In addition, even for those students within the same HSGPA level, higher SAT scores 
are associated with higher second-year retention rates. An examination of students with an HSGPA of A 
in the 2010 cohort shows that retention rates increased as SAT score band increased. Students with an 
HSGPA of A and an SAT score of 890 or lower had a mean retention rate of 55%, while those with an SAT 
score of 2100 or higher had a mean retention rate of 96%. 
Figure 7. Retention to year 2 by SAT and HSGPA (2006-2010 cohorts). 28 
Graduation 

Examining and establishing the relationship between the SAT and retention through college and, 
ultimately, the relationship between the SAT and college completion, is of great import and value to 
colleges and universities. Figure 8 presents the second-, third-, and fourth-year retention rates and 
four-year graduation rates by SAT score band for the 2006 entering college cohort. The results show that 
higher SAT scores are associated with higher retention rates throughout each year of college as well as 
with higher four-year graduation rates. 29 As time passed in the college experience, the percentage of 
students retained decreased. However, students with higher SAT scores had higher retention rates. For 
example, students with an SAT score of 2100 or higher had a four-year completion rate (from the same 
institution) of 75%, while those with an SAT score of less than 900 had a 20% rate of completion in four 
years (from the same institution). 
Figure 8. Retention through four-year graduation by SAT (2006 cohort). 30 



In addition, a recent study 31 examined the utility of traditional admission measures in predicting college 
graduation within four years and found that both SAT scores and HSGPA are indeed predictive of this 
outcome. This study modeled the relationship between SAT scores and HSGPA with four-year graduation, 
and the results confirmed that including both SAT scores and HSGPA in the model resulted in better 
prediction than a model that included only SAT scores or only HSGPA. Figure 9 depicts the model-based, 
expected four-year graduation rates by different SAT scores and HSGPAs. You can see that within 
HSGPA, as SAT scores increase, so too does the likelihood of graduation in four years. Note that students 
with an HSGPA of B (3.00) and a composite SAT score of 1200 are expected to have a 35% probability of 
graduating in four years, compared to a 57% probability of graduating for students with the same HSGPA 
but a composite SAT score of 2100. 
Figure 9. Expected four-year graduation rates by SAT and HSGPA. 32 



Additional research 33 has examined the relationship between four- and six-year graduation rates for 
students who met and did not meet the SAT College Readiness Benchmark of 1550, representing a 65% 
probability of obtaining a FYGPA of a B- (2.67) or higher. This analysis showed the clear relationship 
between the SAT benchmark and college completion. Two samples were analyzed in this study: 
one examining four-year graduation rates and one examining six-year rates. For the four-year graduation 
rate sample, 58% of students meeting or exceeding the SAT benchmark of 1550 graduated within four 
years, compared to 31% of the students who did not meet the benchmark. For the six-year graduation rate 
sample, 69% of the students meeting or exceeding the SAT benchmark of 1550 graduated within six years, 
compared to 45% of students who were not considered college ready. 


Validity Evidence Related to Test Content 

The SAT tests the critical reading, mathematical, and writing skills that students have developed over 
time and that they need to be successful in college. The College Board regularly studies state standards, 
district curriculum frameworks, and the course content of first-year college courses to ensure that the 
SAT does indeed measure and reflect the content knowledge and cognitive processes that students need to 
be ready for, and successful in, college. 


Evidence for the relationship between the SAT critical reading and writing sections and school curriculum 
and instruction is derived from the strong link found between the skills assessed on the SAT with the 
curricula reflected in results from a large-scale national survey. 34 Evidence for the connection between the 
SAT mathematics section and school curriculum and instruction was derived from a common set of 
standards in the field of mathematics education. More recently, the College Board surveyed more than 
5,000 high school and college instructors in English Language Arts (ELA) and mathematics to assess the 
knowledge, skills, and topics taught in high school classrooms and the value placed on these topics in 
higher education. 35 The survey results demonstrated strong support for the ELA and mathematics topics 
assessed on the SAT. Instructors rated the vast majority of topics on the SAT as both important and 
covered in their classrooms. 

In addition, the development of each of the three SAT sections (critical reading, writing, and 
mathematics) is guided by the work of a test development committee composed of both high school and 
college teachers in that subject area. These educators review and discuss each new form of the test. These 
reviews are done both by mail and at the site of the committee meeting. The pre-meeting reviews allow 
for deep consideration and reflection on each question and the test as a whole, plus an opportunity for a 
reviewer to check a reference or to make sure that no wrong answer on a multiple-choice question can be 
successfully defended as correct. The concerns identified during the reviews by committee members are 
discussed in the committee meeting with College Board staff and test developers. Each concern must be 
resolved before the test moves into production and printing for its scheduled administration. 

The College Board is currently in the process of redesigning the SAT in order to provide the higher 
education community with a more comprehensive and informative understanding of students’ readiness 
for college-level work, to more clearly and transparently focus on the knowledge, skills, and 
understandings that students need to be successful in college and careers, and to improve the links and 
connections between assessment and instruction by better reflecting the meaningful, engaging, and 
rigorous work that students must undertake in the best high school courses being taught today. 36 The 
redesigned exam will be introduced in March 2016, and the College Board will maintain and improve the 
high level of technical quality of the SAT as well as its rigorous validity research agenda. 

Notably, the redesigned SAT Evidence-Based Reading and Writing and (optional) Essay portions 
will incorporate key design elements supported by evidence 37 , including: 

• The use of a range of text complexity aligned to college- and career-ready reading levels; 

• An emphasis on the use of evidence and source analysis; 

• The incorporation of data and informational graphics that students will analyze along with text; 

• A focus on relevant words in context and on word choice for rhetorical effect; 
• Attention to a core set of important English language conventions and to effective written 
expression; and 

• The requirement that students interact with texts across a broad range of disciplines. 

The key evidence-based design elements that will be incorporated into the redesigned SAT Math Test 38 
include: 

• A focus on the content that matters most for college and career readiness (rather than a vast array 
of concepts); 

• An emphasis on problem solving and data analysis; and 

• The inclusion of a calculator as well as no-calculator section and attention to the use of the 
calculator as a tool. 

The College Board is committed to ensuring that the content and format of the redesigned SAT are clear 
and transparent and that the exam reflects the best of classroom work and work outside the classroom. 39 

Attention to Fairness 

The Standards for Educational and Psychological Testing 40 point out that, “Ultimately, the validity of an 
intended interpretation of test scores relies on all available evidence relevant to the technical quality of a 
testing system” (p. 17). While this primer will not provide a detailed review of the technical qualities of 
the SAT (additional information can be accessed at www.collegeboard.org), we will highlight the attention 
paid to fairness for all examinees. 

First, it’s important to note that every item used in an SAT form has been previously pretested and 
reviewed. Pretesting can serve to ensure that items are not ambiguous or confusing, to examine the item 
responses to determine the difficulty level or the degree to which the item differentiates between more or 
less able students, and to understand whether students from different racial/ethnic groups or gender groups 
respond to the item differently (also called differential item functioning). Differential item functioning 
(DIF) analyses compare the item performance of two groups of test-takers (e.g., males versus females) 
who have been matched on ability. Items displaying DIF indicate that the item functions in a different 
way for one subgroup than it does for another. Items with sizeable DIF, favoring one group over another, 
will then undergo further review to determine whether the item should be revised and re-pretested or 
eliminated altogether. 
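
To show what a DIF comparison of matched groups can look like in practice, the sketch below computes the Mantel-Haenszel common odds ratio and its delta-scale transform, one widely used DIF statistic. This primer does not name the specific statistic or flagging thresholds applied to SAT items, so the choice of method here is an assumption, and the counts are hypothetical. Each stratum holds test-takers matched on ability (for example, on total score).

```python
import math

def mantel_haenszel_dif(strata):
    """Mantel-Haenszel common odds ratio and its delta-scale transform.

    strata: list of (ref_right, ref_wrong, focal_right, focal_wrong)
            counts, one tuple per matched ability level
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        total = ref_right + ref_wrong + focal_right + focal_wrong
        numerator += ref_right * focal_wrong / total
        denominator += ref_wrong * focal_right / total
    odds_ratio = numerator / denominator
    # Values near 0 suggest little DIF; negative values mean the item is
    # relatively harder for the focal group than for the reference group.
    mh_d_dif = -2.35 * math.log(odds_ratio)
    return odds_ratio, mh_d_dif

# Hypothetical item A: matched groups succeed at the same rate in each stratum.
item_a = [(40, 60, 20, 30), (80, 20, 40, 10)]
# Hypothetical item B: the reference group does better at each ability level.
item_b = [(60, 40, 20, 30), (90, 10, 40, 10)]

odds_a, dif_a = mantel_haenszel_dif(item_a)
odds_b, dif_b = mantel_haenszel_dif(item_b)
print(round(odds_a, 2), round(dif_a, 2), round(odds_b, 2), round(dif_b, 2))
```

Item A yields an odds ratio of 1 and a DIF value of 0, while item B yields an odds ratio of 2.25 and a clearly negative DIF value, the kind of result that would send an item for further review.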


Many critics of tests and testing incorrectly presume that the existence of mean score differences by 
subgroups indicates that the test or measure is biased. Although attention should be paid to consistent 
group mean differences, these differences do not necessarily signal bias. Groups may have different 
experiences, opportunities, or interests in particular areas, which can impact performance on the skills or 
abilities being measured. 41 Many studies have found that the mean subgroup differences found on the 
SAT (e.g., by gender, race/ethnicity, socioeconomic status) are unfortunately also found in virtually all 
measures of educational outcomes, including other large-scale standardized tests, 42 high school 
performance and graduation, 43 and college attendance. 44 

More specifically, critics of the SAT claim it is biased against underrepresented minority students and 
measures nothing more than socioeconomic status. Substantial evidence refutes both these claims. 
Regarding bias against underrepresented minority students, one would expect that if the SAT were 
biased against African American, American Indian, or Hispanic students, for example, it would 
underpredict their college performance. In other words, this accusation would presume that 
underrepresented minority students would perform much better in college than their SAT scores predict 
that they would and that the SAT would act as a barrier in their college admission process. In reality, 
however, underrepresented minority students tend to earn slightly lower grades in college than predicted 
by their SAT scores. This finding is consistent across cohorts and in later outcomes such as second-, 
third-, and fourth-year cumulative GPA. 45 
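
The logic of over- and underprediction can be sketched with a single common regression line: fit one equation to all students, then average the residuals (actual minus predicted FYGPA) within each group. A negative group mean indicates overprediction and a positive one indicates underprediction. This is a simplified illustration with one predictor and hypothetical data for two unnamed groups, X and Y; it is not the model specification used in the cited studies.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

def mean_residual_by_group(sat, fygpa, group):
    """Mean of (actual - predicted) FYGPA per group under one common equation.

    A negative mean means the common equation overpredicts for that group;
    a positive mean means it underpredicts.
    """
    intercept, slope = fit_line(sat, fygpa)
    residuals = {}
    for score, grade, label in zip(sat, fygpa, group):
        residuals.setdefault(label, []).append(grade - (intercept + slope * score))
    return {label: sum(r) / len(r) for label, r in residuals.items()}

# Hypothetical data for two groups with the same SAT scores.
sat = [1000, 1400, 1800, 1000, 1400, 1800]
fygpa = [2.2, 2.8, 3.4, 2.0, 2.6, 3.2]
group = ["X", "X", "X", "Y", "Y", "Y"]

result = mean_residual_by_group(sat, fygpa, group)
print({label: round(value, 2) for label, value in result.items()})
```

Here group Y earns grades about 0.1 below what the common equation predicts (overprediction), while group X earns grades about 0.1 above it (underprediction).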

Although there is a relationship between socioeconomic status and most educational measures, 46, 47 it is 
not true that the SAT is merely a measure of a student’s wealth. Professor Paul Sackett and his 
colleagues at the University of Minnesota have studied this issue extensively. 48, 49 They consistently find 
that across multiple samples, the relationship between SAT scores and college grades remains relatively 
unaffected after controlling for the influence of socioeconomic status. In other words, the relationship 
between SAT scores and college grades is largely independent of a student’s socioeconomic status. 
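
A standard way to express "controlling for" a third variable is the partial correlation, sketched below. The primer does not state the exact method used in the cited studies, so this is an assumed illustration, and the three input correlations are hypothetical rather than reported values.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical inputs, not values reported in the primer or the cited studies.
r_sat_grades = 0.50   # SAT with college grades
r_sat_ses = 0.30      # SAT with socioeconomic status
r_grades_ses = 0.20   # college grades with socioeconomic status

controlled = partial_correlation(r_sat_grades, r_sat_ses, r_grades_ses)
print(round(controlled, 3))
```

With these assumed inputs the correlation moves only from .50 to about .47, which illustrates what it means for a relationship to remain relatively unaffected after controlling for socioeconomic status.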

Conclusion 

This document represents a summary of much of the recent validity research on the SAT. In particular, 
the relationships between SAT scores and college grades, retention, and graduation are highlighted and 
described. Information regarding the content on the SAT and a focus on the test’s fairness are explained. 
Being familiar with and able to cite much of this national SAT validity research can help individuals 
refute uninformed criticisms of the test and provide the public with a deeper understanding of the SAT 
and its strengths as an educational tool. 